Our group endeavors to demystify the intricacies of the New York City housing landscape, providing a thorough analysis that spans the diverse boroughs and neighborhoods of this vibrant metropolis. Utilizing a rich dataset, we aim to unravel the variegated tapestry of real estate prices, offering insights that cater to potential homebuyers, sellers, and market enthusiasts alike.
NYS Retail Food Stores: This dataset is hosted by the State of New York on its open data platform, and the state updates it according to the amount of data that is brought in.
https://www.kaggle.com/datasets/new-york-state/nys-retail-food-stores
New York Housing Market: This dataset contains prices of New York houses, providing valuable insights into the real estate market in the region. It includes information such as broker titles, house types, prices, number of bedrooms and bathrooms, property square footage, addresses, state, administrative and local areas, street names, and geographical coordinates.
https://www.kaggle.com/datasets/nelgiriyewithana/new-york-housing-market/data
Mengyan Xu Email: mx2283@columbia.edu
Annika Xu Email: jx2552@columbia.edu
Cynthia Chen Email: yc4336@columbia.edu
We began our analysis by using a heatmap to visualize the relationships among property square footage, number of bedrooms, number of bathrooms, and price. As the graph indicates, every pair of variables is positively correlated. Among these, the number of baths and the number of beds have the strongest correlation. Moreover, properties with more bedrooms and bathrooms tend to have larger square footage and command higher prices.
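The pairwise correlations that a heatmap like this displays can be sketched in plain Python. This is only a minimal illustration, not the code behind the figure: the column names (`PRICE`, `BEDS`, `BATH`, `PROPERTYSQFT`) are assumed, and the five listings are made-up values rather than rows from the dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Map every (name_a, name_b) pair of columns to its Pearson correlation."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Made-up listings for illustration only; column names are assumptions.
listings = {
    "PRICE": [450_000, 780_000, 1_200_000, 2_500_000, 640_000],
    "BEDS": [1, 2, 3, 5, 2],
    "BATH": [1, 2, 2, 4, 1],
    "PROPERTYSQFT": [650, 1100, 1600, 3200, 900],
}

matrix = correlation_matrix(listings)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a:>12} vs {b:<12} r = {r:+.2f}")
```

Each value in `matrix` corresponds to one cell of the heatmap; the diagonal is always 1 because every variable correlates perfectly with itself.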
When analyzing property values, it is crucial to understand how much each house type contributes to the average price. The waterfall chart above illustrates these incremental contributions. Each blue box represents a specific house type, and its size corresponds to that type’s incremental contribution to the average price. From the graph, we can conclude that house types such as ‘Contingent’, ‘Foreclosure’, ‘Multi-family home for sale’, and ‘Townhouse for sale’ contribute significantly to the average price.
From the bar chart above, we can conclude that ‘Townhouse for sale’ has the highest selling price, while ‘Co-op for sale’ has the lowest. Furthermore, ‘Townhouse for sale’ exhibits the widest price range, while ‘Contingent’, ‘Foreclosure’, and ‘Co-op for sale’ have comparatively narrow price ranges. Notably, ‘Co-op for sale’, ‘Condo for sale’, ‘House for sale’, and ‘Multi-family home for sale’ display a higher number of outliers.
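One common way to flag outliers within a house type is the 1.5 × IQR rule used by box plots. The sketch below assumes that convention, which may differ from how the chart above was produced. The function name and the sample prices (in thousands of dollars) are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(prices):
    """Return prices outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (box-plot convention)."""
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [p for p in prices if p < low or p > high]

# Made-up prices (in $1000s) for a single house type; not taken from the dataset.
sample_prices = [300, 320, 350, 380, 400, 420, 2000]
outliers = iqr_outliers(sample_prices)
print(outliers)
```

Running this per house type and counting the results would give the number of outliers for each category.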
This interactive leaflet map illustrates the geographic distribution of establishments across all five boroughs of New York City. Individual establishments are drawn as blue circles, each with a pop-up window showing the address, the establishment type, and the price per square foot. When zooming out, one sees clusters labeled with the number of establishments in that area. This visualization helps identify the density of accommodation resources within New York City, and the interactivity allows users to navigate freely among neighborhoods.
This interactive leaflet map is a choropleth map shaded by the median price per square foot in each USPS zip code area. The color scheme runs from bright yellow to dark purple, with brighter colors indicating a higher median price per square foot. This visualization provides an overview of New York City’s housing price levels at a finer scale, by neighborhood. The interactive feature lets users zoom in and select specific neighborhoods to check the exact median price.
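The quantity that shades each zip code can be sketched as a simple group-and-aggregate step. The example below is an illustration under assumptions, not the code behind the map: the field names (`zip`, `price`, `sqft`) and the rows are invented.

```python
from collections import defaultdict
from statistics import median

def median_price_per_sqft_by_zip(listings):
    """Group listings by zip code and return the median price per square foot."""
    by_zip = defaultdict(list)
    for row in listings:
        if row["sqft"] > 0:  # skip rows with missing or zero square footage
            by_zip[row["zip"]].append(row["price"] / row["sqft"])
    return {zip_code: median(values) for zip_code, values in by_zip.items()}

# Made-up rows for illustration; field names are assumptions.
rows = [
    {"zip": "10023", "price": 1_000_000, "sqft": 1000},
    {"zip": "10023", "price": 1_500_000, "sqft": 1000},
    {"zip": "10023", "price": 600_000, "sqft": 500},
    {"zip": "11201", "price": 800_000, "sqft": 1000},
    {"zip": "11201", "price": 900_000, "sqft": 1000},
]
medians = median_price_per_sqft_by_zip(rows)
print(medians)
```

The median is less sensitive than the mean to a few very expensive listings, which makes it a reasonable summary for a choropleth.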
We are also interested in examining whether there is a correlation between the number of retail stores and house prices. This interactive leaflet map functions similarly to the previous point map, plotting individual retail establishments and showing clusters once zoomed out.
This interactive leaflet map is a choropleth map shaded by the number of retail establishments in each USPS zip code area. The color scheme runs from bright yellow to dark purple, with brighter colors indicating more retailers. This visualization provides an overview of New York City’s retail activity at a finer scale, by neighborhood. The interactive feature lets users zoom in and select specific neighborhoods to check the exact number of retailers.
To delve into the factors behind divergent housing prices across neighborhoods, we explored whether the accessibility of retail stores, such as Starbucks, Dunkin’ Donuts, or other stores, might affect housing values. We hypothesized that neighborhoods with greater retail convenience might command higher housing prices. To test this hypothesis, we used an additional dataset (the NYS Retail Food Stores dataset) and focused on the New York City area as an example. For each neighborhood (zip code), we computed the average housing price alongside the number of retail stores present.
Surprisingly, our analysis did not reveal a clear correlation between these variables. Interestingly, however, certain zip code areas, such as 10023 and 10024, showed a pattern in which fewer retail food stores coincided with lower neighborhood prices.
This bar chart titled “Property Type Counts” showcases the frequency of different property types available for sale. At first glance, it’s apparent that the most common type is the ‘House for sale’, towering over the others with the highest count. Following that, ‘Condo for sale’ and ‘Multi-family home for sale’ are also quite prevalent in the market. On the lower end of the spectrum, there are categories like ‘Mobile home for sale’ and ‘Pending home for sale’, which are less numerous.
We find it interesting to note the diversity of property types, indicating a vibrant and varied real estate market. This chart effectively communicates the breakdown of property types, providing a clear visual representation that can be easily interpreted at a glance. It’s a useful tool for quickly assessing which types of properties dominate the market and could inform decisions for both real estate professionals and potential buyers.
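The counts behind a chart like this amount to a frequency tally of the property-type column. The sketch below shows that tally on a handful of invented values. It is illustrative only and does not reproduce the real counts.

```python
from collections import Counter

# Made-up property types for illustration; not the real distribution.
property_types = [
    "House for sale", "Condo for sale", "House for sale",
    "Co-op for sale", "House for sale", "Condo for sale",
]

type_counts = Counter(property_types)
ranked = type_counts.most_common()  # sorted from most to least frequent

# Print a simple text bar chart, one row per property type.
for name, n in ranked:
    print(f"{name:<20} {'#' * n} {n}")
```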
In this scatter plot, we’ve visualized the relationship between house prices and property square footage for listings handled by different real estate brokers. Each point on the graph represents a property, with the horizontal axis showing its size in square feet and the vertical axis showing its price. The color of each point corresponds to a specific broker, allowing us to see at a glance how property size relates to price across brokers.
From this graph, we can observe that certain brokers, like those represented by the clusters of points towards the bottom, have a range of properties at varying sizes and prices, while others may specialize more in either higher or lower-priced markets, regardless of the size. For instance, some brokers seem to handle more expansive properties, as indicated by the larger square footage values, while others have properties that are smaller but vary widely in price.
By making this plot interactive with plotly, we enable the viewer to hover over individual points for more detail, such as the exact size, price, and the broker listing the property. This interactivity makes the plot a useful tool for data exploration and offers insight into how pricing in the real estate market varies by broker.
Here we have a heatmap titled “House Price Distribution Heatmap by Broker,” which we created to analyze and visualize the property pricing strategies of different real estate brokers. This chart categorizes properties by price range, displayed on the y-axis, with the corresponding brokers on the x-axis.
Each cell’s color intensity reflects the number of properties that fall within a specific price range for a given broker. Darker shades of blue indicate a higher concentration of properties within a price range, as denoted by the color scale on the right. For example, you can see that some brokers have a higher concentration of properties in particular price ranges, shown by the dark blue tiles.
This visual representation allows us to quickly identify patterns and outliers in the pricing strategies of these brokers. It’s clear that some brokers tend to handle properties within a narrower price range, while others have listings spread across a broader spectrum of prices. This insight could be pivotal for potential buyers who are targeting specific price ranges, as well as for market analysts interested in the competitive positioning of brokers in the real estate market.
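The cell values of a broker-by-price-range heatmap can be sketched as a count over (broker, price bin) pairs. The bin edges, labels, broker names, and prices below are all assumptions chosen for illustration; the heatmap described above may use different ranges.

```python
from bisect import bisect_right
from collections import Counter

# Assumed bin edges and labels; the actual heatmap may use other ranges.
PRICE_EDGES = [500_000, 1_000_000, 2_000_000]
LABELS = ["< $500K", "$500K-$1M", "$1M-$2M", ">= $2M"]

def price_bin(price):
    """Return the label of the price range containing `price` (lower edge inclusive)."""
    return LABELS[bisect_right(PRICE_EDGES, price)]

def broker_price_counts(listings):
    """Count listings for each (broker, price range) pair, i.e. one heatmap cell each."""
    return Counter((broker, price_bin(price)) for broker, price in listings)

# Made-up listings with hypothetical broker names.
listings = [
    ("Broker A", 450_000), ("Broker A", 480_000), ("Broker A", 1_200_000),
    ("Broker B", 2_500_000), ("Broker B", 750_000),
]
cells = broker_price_counts(listings)
for (broker, label), n in sorted(cells.items()):
    print(f"{broker:<10} {label:<10} {n}")
```

A broker whose counts concentrate in one or two bins corresponds to the narrow-range pattern noted above, while counts spread across many bins indicate a broader spectrum of prices.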